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Abstract. Activated processes from chemical reactions up to confor- 
mational transitions of large biomolecules are hampered by barriers 
which are overcome only by the input of some free energy of activa- 
tion. Hence, the characteristic and rate-determining barrier regions are 
not sufficiently sampled by usual simulation techniques. Constraints 
on a reaction coordinate r have turned out to be a suitable means to 
explore difficult pathways without changing potential function, energy 
or temperature. For a dense sequence of values of r, the correspond- 
ing sequence of simulations provides a pathway for the process. As 
only one coordinate among thousands is fixed during each simulation, 
the pathway essentially reflects the system's internal dynamics. From 
mean forces the free energy profile can be calculated to obtain reaction 
rates and insight in the reaction mechanism. In the last decade, theo- 
retical tools and computing capacity have been developed to a degree 
where simulations give impressive qualitative insight in the processes 
at quantitative agreement with experiments. Here, we give an intro- 
duction to reaction pathways and coordinates, and develop the theory 
of free energy as the potential of mean force. We clarify the connection 
between mean force and constraint force which is the central quantity 
evaluated, and discuss the mass metric tensor correction. Well-behaved 
coordinates without tensor correction are considered. We discuss the 
theoretical background and practical implementation on the example 
of the reaction coordinate of targeted molecular dynamics simulation. 
Finally, we compare applications of constraint methods and other tech- 
niques developed for the same purpose, and discuss the limits of the 
approach. 
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1 Introduction 

In the last decade, theoretical tools and computing capacity have been developed to a 

degree where simulations give impressive qualitative insight in activated processes at 
quantitative agreement with experiments. On the molecular scale, such processes are 
a typical part of the overall dynamics reaching from association over folding of biolog- 
ical macromolecules down to chemical reactions. The actual interest of theoreticians 
is in predicting equilibrium constants for end states and rates for transitions between 
them because that are the data obtainable from experiments. However, the real time 
scale is often orders of magnitude larger and prevents direct observation of processes 
during simulation. On the other hand, theories like transition state theory are at hand 
for calculating kinetic and equilibrium data from knowledge of the potential energy 
function without solving equations of motion of equivalent methods. The feasibility 
is due to an enormous reduction of the high dimensional configuration space to one 
or a few reaction coordinates and a minimum of features along a reaction path. 
Definition of a reaction coordinate (RC) subject to a sliding constraint seems to be the 
most natural way for collecting data that allow full characterization of the reaction 
path, and was recently declared the "method of choice" after a comparison of related 
methods [1]. Actually, constrained ensembles, sometimes also referred to as "blue- 
moon" ensembles [2], were only considered and treated theoretically in the nineties. 
Currently, they are used in different types of simulations to generate and evaluate 
pathways of activated processes. The large and still increasing number of applica- 
tions is probably due to the increasing computing capacity. Numerous biomolecules 
in aqueous environment have meanwhile been studied by classical simulations with 
constraints, for instance the large systems containing the chapcronin GroEL [3,4] or 
the rotatory F-l-ATPasc [5]. On the other hand, single chemical reaction steps are 
analyzed in much detail by demanding ab-initio calculations [6-8]. QM/MM hybrid 
calculations were performed to treat enzyme catalysis or reactions in solution [9] . 
In a classical picture, activated processes take place in a potential energy landscape 
V{xi...xn) by starting from a local minimum A, climbing up to a saiidlo point called 
the activated state and finally ending in a second minimum B. A minimum-energy 
pathway (MEP) connecting A and B is a curve associated with a monotonically in- 
creasing reaction coordinate r{xi...XN) and a free energy A(r) which has a maximum 
at the activated state situated at r = r'^ if free energy is dominated by energy. For 
the calculation of transition rates, knowledge of A{r^) — A{rA) is crucial, but difficult 
to obtain from simulations because the probability P{r^) to find the system in the 
activated state is minimal along the pathway and extremely small in all interesting 
cases. Like often in the field, the occurrence of rare events hampers the statistical 
evaluation due to insufficient sampling if no particular measures are taken. This is 
due to the relation 

A{r) = -kBT\nQ{r) = -fcsTlnP(r) - fcsTlnQ (1) 

between probability P(r) and the partition function for a given value of r, Q(r). Q 
is the full partition function. We shall develop the theory for Helmholtz free energy 
A, but the final formula will hold for Gibbs free energy G as well when mean values 
are taken at constant pressure and temperature. The brief outline is indicating the 
essential problems that have to be solved when describing an activated process by 
means of molecular dynamics simulations: once that the states A and B are known 
one has (a) to determine pathways, (b) to assign reaction coordinates, and (c) to 
calculate free energy A(r). The order of the first two tasks is not mandatory. The 
focus of this review is on the third task, but we will also touch the other topics since 
they are all interrelated. 
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When probability cannot be evaluated directly from sirmdation data by histogram 
methods, there are essentially two possibilities to make sure sufficient sampling ev- 
erywhere. The first is application of a so-called umbrella potential that restrains the 
system near the hypcrsurface r(xi...XN) = rg,.i of the desired RC [10]. This widespread 
method called umbrella sampling [11] requires postprocessing of data in order to cor- 
rect for the umbrella potential and has been developed further recently [12]. The 
second method tracing back to Carter et al. [13] is based on the idea of a conditional 
ensemble E where the RC takes exactly a given value r{xi...XN) = r^et, which would 
allow unrestricted sampling everywhere along the pathway. The corresponding parti- 
tion function is [14] 



Q(r)= y exp(-^y(q;r))g(q;r)2d^-ig (2) 

i.e. among A'^ generalized coordinates qi...qN one selects r = as RC. The symbols 
{xi...Xn) are reserved for Cartesian coordinates and /3 = l/ksT. The derivative of 
free energy is a mean force 



that is composed of a mean potential force and a second, entropic contribution origi- 
nating from the mass-metric tensor determinant (^(q; r). Although attractive from the 
theoretical point of view, this approach poses serious problems in numerical praxis. 
The conditional ensemble E does not coincide with the constrained ensemble C that 
would be generated by applying the holonomic constraint r{xi...XN) = rget which 
implicitly also entails vanishing velocity r{xi...xj^) = 0. Therefore, it was early pro- 
posed without proof to compute the mean force from the constraint ensemble and to 
replace the derivative of the potential by the negative constraint force [15]. The cru- 
cial step forward was made soon afterwards in a rigorous analysis of the constrained 
case by Mulders et al. [16]. They could show that in this case no mass metric ten- 
sor contribution occurs and the mean force is exactly the negative constraint force, 
which opened a practicable way to compute numerically relevant mean forces and the 
free energy. At the same time it was clear that the full problem was not yet solved, 
and the solution would require considering the metric tensor effect. To be viable, the 
solution should also avoid the partial derivative with respect to the RC in (3). Note 
that its evaluation requires definition of really all coordinates qi...qjsf which is often 
extremely difficult and practically impossible. Den Otter and Briels [17] showed that 
use of g{q; r) can be avoided by considering the Fixman determinant z and the mean 
force can be written 



{f)E^-^ = -{nc + ^OB{z) (4) 

where ^ob{z) depends on and the gradient of z that can be evaluated using Carte- 
sian coordinates without regress to the generalized coordinates qi...qN. In a similar 
way Sprik and Ciccotti [18] were able to derive an equivalent expression with a for- 
mally different correction term <l>sc{z)- Thus, theory had reached a stage where it 
could be applied successfully to the numerical calculation of free energy profiles for 
interesting cases. Generalizations were made to allow for instantaneous instead of 
constraint force with a correction <Pdp(z) [19], multiple constraints often applied in 
molecular dynamics simulations and more-dimensional reaction coordinates [19,20]. 
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Later, theory was reconsidered in order to get a unified concise formulation of the 
confusing seemingly incompatible expressions ^xy {z) found earlier for the necessary 
correction term. In fact, a simple form was determined and proven to coincide with 
previous proposals. The theory part of this review will follow this derivation [21,22] 
which offers a relatively direct approach by starting from the constrained case. 
The methods mentioned so far and compared in [1] can be subsumed as equilibrium 
methods based on equilibrium densities (histogram or umbrella sampling)or equilib- 
rium mean constraint force. Non-equilibrium methods have been popular in some 
other fields of free energy calculation, but for the potential of mean force such an 
approach called 'dynamic umbrella sampling' was only recently designed by Hummer 
and Szabo, see [23]. 

The constraint method is related to others from the wide field of free energy cal- 
culations [14,23], in particular to thermodynamic integration. The preceding consid- 
erations suggest a short-hand notation Hc{<\i\i\ r) for the constrained Hamiltonian 
where r is only a parameter. The mean force given by (/)c = —{5Hc/5r)c determines 
free energy as the potential of mean force. This is a formulation one would expect 
when naively transferring the formalism of thermodynamic integration to the present 
problem replacing the A-parameter by the RC. As a matter of fact, this is possible 
only for the constrained case. In general, a coordinate is more closely connected with 
dynamics than a A-parametcr and needs a suitable treatment. Nevertheless, the sim- 
ple relation dA/dr — {dHc/dr)c holds for some interesting reaction coordinates as 
will be proven below. The review starts in section 2 with the search for pathways and 
properties of reaction coordinates. The choice of appropriate RCs and pathways are 
indispensable steps preceding free energy calculation. Section 3 outlines the essen- 
tial parts of theory. Numerical applications from different fields and comparisons of 
methods are reviewed in the last section 4 followed by a summary. 



2 Pathways and reaction coordinates 

Suitable pathways and reaction coordinates are the ingredients for full characteriza- 
tion of activated processes. Only equilibrium constants can in principle be determined 
from the end states of a process by calculating absolute free energy. The usual way 
is, however, the calculation of relative free energy along a reaction path. 

2.1 Search for pathways 

The idea of stable states of reacting molecules and reaction pathways is influenced by 
pictures of a potential energy surface (PES) spanning over a two-dimensional area, 
i.e. by low-dimensional problems. Stable states are identified witli local minima and 
pathways with minimum-energy paths that connect them while saddle points on the 
way define activated complexes. The MEP is a one-dimensional curve r{qi...qN) of all 
N particle coordinates along which the probability density P{qi...qN^i:r) = P(q; r) 
is maximal at fixed r. When it connects the minima of stable states, it is a reasonable 
reaction coordinate describing a productive reaction pathway (Figure 1A,C). There 
are numerous techniques for determining MEPs, for instance those described in [24] 
that were more and more refined in the course of time. The basic idea is definition of 
an initial trial path that is deformed to become a MEP. A more recent approach is 
transition path sampling [25] for reactive pathways designed to find paths connect- 
ing the end states by means of dynamics simulations. Note that these methods are 
reaction coordinate-free methods. Once that a path is known, one has a sequence of 
not necessarily equidistant points in configuration space that can be assigned values 



Fig. 1. Excamples of potential energy surfaces (top) and free energy profiles (bottom). 
Smooth surfaces and profiles are found at small molecules in vacuo (left). Rugged surfaces 
are typical of marcomolecules like proteins (right). 

ri...r„ of an RC. 



Macromolecules possess a much more complex PES than small molecules (Figure 
1B,D). The glass transition in proteins, for instance, has proven experimentally the 
existence of a vast number of substates separated by barriers which are permanently 
crossed at temperature above 200-250 K [26]. In simulations, this behavior was con- 
firmed by the observation that about every 150 fs at 300 K the protein is crossing a 
barrier towards the next minimum in the PES [27]. The definition of a pathway as 
a MEP makes no more sense at finite temperature as multiple possible trajectories 
connect basins on a rugged energy landscape. The whole conformational space ac- 
cessible from one point at fixed RC now belongs to the same pathway. Moreover, a 
bundle of separated pathways can be available like in small systems. Methods were 
designed to allow for these characteristics and to tackle the enormous costs of simu- 
lating biological macromolecules in their natural environment. The search for a path 
that connects the known folded with an unfolded conformation of a protein [28] based 
on the principle of least action is an example for the extension of ideas developed for 
small systems with new techniques. The methods mentioned so far aimed to deter- 
mine first a pathway and then assign an RC for evaluation. At large molecules, it can 
make sense to start from a predefined, preliminary RC and then to drive the system 
along the RC towards the desired end state or away from a starting structure. This 
is done by application of a series of constraints 

r{t) = r,; i = 0, 1, ...m; tq > ri > ...r™ (5) 

with increasing (or decreasing) RC where the starting structure is situated at r = ro 
and the last structure at r = r™ is a known or undefined structure that is to be 
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determined that way. The constraint can also be time-dependent according to 



r{t) =ro + {vm -ro)t/T 0<t<T (6) 

and increase (or decrease) the RC during a simulation of period T. The first RC 
employed in this way for a macromolecule was the TMD coordinate defined as the 
rms distance of selected atoms from their position in a target structure that will be 
discussed below. The series of RC values was generated by a time-dependent r(t) in 
a MD simulation. It was employed to find a path for a secondary structure transition 
in insulin [29], later applied to an unfolding problem [30] in a similar way and has 
found many applications in large protein conformational transitions, for instance [31]. 
Driving an RC should not be confused with the slow growth technique for free energy 
calculation. It serves merely for generating pathways independent of the method by 
which they are later characterized thermodynamically. 



2.2 Reaction coordinate 

The common idea of elementary chemical reactions or association/dissociation is sug- 
gesting to use a distance between nuclei as a RC [32,33]. At more complex reactions, 
the RC of single steps may not suffice to define a path and can be combined to a new 
RC as just mentioned [34]. For rotation [35] and isomerization [20], dihedral angles are 
appropriate RCs. The references represent only examples of a vast number of cases 
where RCs were employed to describe, simulate or fully evaluate reaction pathways 
of molecular activated processes. 

In many cases, distance-like coordinates are a good choice for defining a prelimi- 
nary RC even at complex situations. A very general form for n particles involved is 




3n \ 1/2 

r(x) = ( ^ ^^{xi - y,f ] =\Y^ nj'ivj - sjf | (7) 

which is a mass-weighted rms distance between a configuration x{t) = {xi...X3n) = 
(ri...r„) and a fixed target or reference configuration y = {yi-.-ysn) = (si...s„). The 
3n Cartesian coordinates belong, for instance, to all or some selected atoms of a 
macromolecule. Mass- weighting of the coordinates as introduced by factors fij = rrij/ 
m* can be important as will be shown below. If, however, the masses are not too 
different and m* is chosen to be the mean mass m* = M/n of a system of total mass 
M, then r is approximately the geometrical distance of x and y in configurational 
space, and the rms distance between the configurations if to* is set equal to M. This 
RC applies to different problems where it has different meaning: 

(a) Distance from, a surface. To describe adsorption of a molecule on a plane x = 0, 
the first sum in (2.3) is restricted to xi which then denotes the center of mass 
distance of the molecule from the surface. 

(b) Two-particle distance. With the restriction to A'' = 1 , r can describe the (mass- 
weighted) distance between two atoms or the centers of mass of two molecules at 
positions x and y, respectively, which is interesting for association/dissociation re- 
actions. When the atoms coincide with the first and last atom of a chain molecule, 
r becomes the end-to-end distance similar to the RC of steered dynamics [36] . 
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(c) Distance between conforni,ers. After superposition of the structure x of a given 
molecule to the reference structure y by first translating and then rotating x, the 
resulting minimum distance measures the structural (rms) distance between the 
conformers. This sort of distance is used in targeted molecular dynamics simula- 
tions (TMD) [29] 

(d) Radius of gyration, r is the radius of gyration when all reference positions rj are 
replaced by the position of the center of mass, R and m* = M, as then 



A similar, but different RC is the dynamic distance [6] defined by 



\ ^ nopsp—l I 

Here the sum runs over all Np non-overlapping pairs of selected atoms. The expression 
= rriimj / {rrii + nij ) is the reduced mass of the respective pair, and m* an arbitrary 
constant mass. This RC shares the favorable properties discussed in the following due 
to appropriate mass-weighting. With the restriction to n = 1, r can describe the 
(mass-weighted) distance between two atoms or the centers of mass of two molecules 
at positions x and y, respectively, which is interesting for association/dissociation 
reactions. When the atoms coincide with the first and last atom of a chain molecule, 
r becomes the end-to-end distance, i.e. the RC of steered dynamics [26]. 



2.2.1 Mechanical properties 

As we have seen, reaction coordinates are employed to drive a system from the start 
to the target state or to measure the constraint force, which is the topic of the next 
chapter. In any case, a force is applied to the system under consideration that can 
cause side effects like translation, rotation or the like. There may be applications 
where such motions are in the focus of the interest. Normally, they represent unde- 
sircd features that can be discarded by using suitably designed RCs. As an example 
for distance-like coordinates, we study the coordinate (8) r{xi...X3n) = ''(ri.-.r„) fixed 
by the constraint 



a (r; ro) = r - ro = I ^ (r^ - Sjf j - ro = (10) 
It results in constraint forces 

The total constraint force 

Ftot^Ef/ = /'E'"/(ri-Si)A (12) 
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is apparently proportional to the difTcrcncc between the centers of mass of the momen- 
tary structure (ri...r„) and the target (si...s„) as = rrij/m*, and vanishes when 
they have been superimposed. A similar argument proves that also the total torque 
^tot ^ X] ^ fj vanishes after superimposing [29,35]. Note that mass- weighting is 
an important ingredient for these favorable properties which prevent rigid-body mo- 
tions to disturb the molecular dynamics. 



2.2.2 Statistical mechanical properties 

It is a well known fact that geometrical constraints can disturb the probability density 
function (PDF) of the unconstrained system in configurational space, P{qi...qN)- If 
a constraint is imposed to r = qat it gives rise to a different PDF Pc{qi...lN-i',r). 
However this is not the case for the above defined distance. As proven by Fixman [37] 
the probabilities are connected by 

Pc{qi ... Qn-i; r) oc ^/det(H)P {qi ... g'jv-igjv) (13) 
where the so-called Fixman determinant is 

^ mi \dxij 

For the distances (8) and (10) one easily derives z = 1/m*, which is a constant. 
This proves that the PDF in configurational space is not changed by introducing the 
constraint r = const. Already the basic formula (1) indicates that also free energy is 
not affected by the constraint if 2; = const, which facilitates free energy calculation 
drastically. 



3 Theory 

In the introduction, the basic ideas and problems on the way to a free energy profile 
were already outlined. This section presents the essential steps of a rigorous derivation 
of the free energy profile. 



3.1 Probability and free energy 

As shown befor{\ the concept of a reaction coordinate r amounts to computing the 
probability P(r) or tlie free energy A(r) for one selected coordinate r = qat of N coor- 
dinates qi...qN with velocities = qi and associated canonical momenta pj. With the 
convention q = {q\...qN)^ for coordinates and v = q for velocities the Hamiltonian 
can be written as 

^^(q, P) = V^(q) + Iv^Av = V (q) + ip^A-^p (15) 

where V denotes the potential energy, and A is the mass-metric tensor that contains 
the masses Wj [37] and defines momenta by p = Av. The corresponding partition 



(14) 
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function is (up to a constant factor) 



Q = J dq^ dp^ eM-P H) (16) 



For a given value of the reaction coordinate, the PDF is given by the reduced parti- 
tion function 



Q{r) = j dq^ dp^ exp(-/3 H) S{r - r') = j dq^''^ dp^-'^dpr exp(-/3 H) (17) 

as P(r) = Q(r)/Q = exp (- (3 A(r))/Q. The associated free energy defined as A(r) = 
—UbT In Q(r) is a function of the RC that is often denoted as the free energy profile. 
By taking the derivative of (18) with respect to the RC one obtains a crucial relation 
between the mean force (/) and free energy (1), 




Apparently, free energy is the potential of mean force (PMF). In practice, it is impos- 
sible to compute numerically P( r) except for small systems where histogram methods 
can be applied. However, it is possible to introduce a constrained system satisfying 
the two conditions qn = f = const and = 0. The restriction of velocity enables 
sampling for arbitrary periods and measuring, for instance, the mean constraint force 
which was early expected by van Gunsteren [14] to deliver free energy and proven 
later by Miilders et al. [16]. The constrained Hamiltonian He satisfying both condi- 
tions is 



H,iq, p; r) = F(q) + iv^av = y(q) + ip^a-^p (19) 

Here q = {qi...qN-i)^ contains only the first N — 1 coordinates and the kinetic en- 
ergy depends only on the restricted set of velocities v = {qi...qN-i)^ or momenta 
p = (pi...pjv-i)^ defined by p = av. a is the mass-metric tensor of the constrained 
system. The free energy Ac{r) = —ksTlnQcir) for this constrained system is given 
by its partition function 

Qe(r) = j dq^-' dp^-' exp(-/3 if,(q, p; r)) (20) 

We now focus on the free energy change dA/dr in (1) that is obtained directly from 
Q{r) according to 



f=-ksTllnQir) (21) 



or, using the constrained quantities, as 



dr dr dr Qc{r) 



(22) 
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It turns out that all of these quantities can be obtained in the constrained system. 
—dAc/dr = {f)c, the mean force measured in the unnaturally constrained system, 
will be treated in the next section. After that, the correction term is calculated. The 
derivative —dA/dr = (/) is the desired mean force which by numerical integration 
yields the free energy profile A{r). 



3.2 Constrained case 

Like above in (14) , the mean force is also in the constrained system given by 



- dA,/dr = if)^ = i-dHjdr)^ (23) 

On the other hand, a mean constraint force {f'^)c occurs and can be measured nu- 
merically as a consequence of the applied constraint. The usual equations of motion 
in Cartesian coordinates are Lagrangian equations of the first kind like 



dV , dr 

^^^^ = -a^+^^ (24) 

where the Lagrange parameter is identical with the constraint force, X — and re- 
sults from the condition r = rget ~ const. Anticipating the result of the following 
detailed analysis by Miilders et al. [6], the mean force turns out to be the negative 
mean constraint force 



{f)^ = {-dHJdr)^ = -{nc (25) 

The mean force in the constrained system is the negative mean constraint force. The 
derivation of (26) is very subtle since it requires switching from Lagrangian to Hamil- 
tonian formalism. One has to consider in more detail the mass-metric tensor A which 
is defined by its matrix elements 

A« = 5]m.^f^ (26) 
^ dqk dqi 

and contains mass-weighted derivatives of a set of Cartesian coordinates with respect 
to the generalized coordinates {qi ...qN-iqw) = (q^^jv) = {c^r). A and its inverse 
can be written as block matrices, 



(27) 



where a is the [N — 1) x (A'^ — 1) matrix used above. Then the free Lagrangian is 



L = I l^q r j A\ci fj — V (q^r) . In contrast, the constrained Lagrangian with only 
N-1 dynamical coordinates reads 

Lc (q, 4; r) = i4^a4 - V (q^; r) (28) 
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and the corresponding Hamiltonian 



He (q, p; r) = ip^ap + V (q^; r) (29) 

with canonical momenta p = dL^/dq contain the RC only as a parameter. Using the 
identity da.~^/dr = — {da/dr) a.~^ , one finds the relation 



dHe (q, p; r) 
dr 



dLc (q, 4; 
dr 



(30) 



On the other hand, Lagrangian equations of the first kind can be formulated using 
L*^^^ = L + Xr where A is the constraint force subject to the condition r = const. L^^> 
is usually expressed by all N Cartesian coordinates (a;i...a;jv) in numerical applica- 
tions. For this argument, however, the above generalized coordinates {qi---qN-i<lN) 
are chosen. We then evaluate the derivates at r = const and f = which yields 



5iJe(q,P;r)^^_ 



dr 



dr 



= A- ^ 
dt 



dt dr 

b(q;r)4 



(31) 



Here, we first employed (31), then inserted the general Lagrangian equations, and 

finally made use of the mass- metric tensor (28). In order to finish the proof, one has 
to remember that the ensemble average of a time derivative must vanish and the 
Lagrange parameter A is identical to the constraint force. Thus, it holds {dHc/dr)^ = 
(A)^ = which finally proves eq. (26). 



3.3 Correction for the unconstrained case 



For calculating the correction term in (23) the partition function Q{r) is transformed 
so that it can be related to the one of the constrained case, Qc{r)- We switch to the 
Hamiltonian formalism and make use of canonical momenta. A decomposition of the 
momentum vector, p = (pi...pjv-iPr)^ = {p^,Pr)^ is chosen in accordance with the 
above introduced block structure of the mass-metric tensor (3.13). This allows one to 
decompose the kinetic energy [17] as 



K = Va-^p 

r (32) 

= -P^A-Ip+ i(p^ + 2-V^P)^^(Pr + 2-V^P) 

Inserting K in the partition function makes it possible to perform the integral over 
Pr and to obtain a very simple result. 
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Q{r) 

= J {dqdp)^-'dprexp (V + ip^a-ip + lipr + z-'y^pfz{pr + z-'y^p))) 
= /a'(?^-i#^-iexp(-/3(t/(q) + ip^a-ip)) z-^cl) ■ const (33) 
= / dq'^-^dp^-^exp{-/3H^{^,p;r)) z-^(l)■ const 
= ^^~2^ Qc{r) ■ const 

Note that i?c(q, p;r) is the Hamiltonian of the constrained system defined in (30), 

and Q(r) the corresponding partition function which is inserted here to transform the 
integral into a thermal average. One finds for the ratio of the partition functions 

Q/Qc = const ■ {z-^'"^) (34) 

where the quantity z is the well known Fixman determinant of the coordinate trans- 
formation used the average of which is a function of the RC. We can now rewrite the 
differential form of the free energy (23) as 

^ = W.-<=.tJ;1„(.-./^)^ (35) 
Integration reveals a simple form which is easily evaluated numerically, 



A{r) = j {Xr)Jr - ksTln (^"'/')^ (36) 

So far, the analysis was made for the case of only one constraint coordinate, the RC r. 
The straightforward extension to multiple constraints is found in other works[20,22,33]. 

According to the fundamental relation (1), the probability for finding the system 
between rand r + rfr is 

P{r)dr = const ■ exp(-A{r)/kBT)dr (37) 

Thus, the free energy profile A(r) reflects the equilibrium probability distribution in 
terms of barriers and wells, and enables application, for instance, of transition state 
theory. However, the interpretation may sometimes be difficult because a Jacobian 
determinant can be contained in P{r) and distort the intuitively expected profile. 

As an example, consider the distance in configuration space defined in (8). It 
is known from applications of TMD that the profile always increases dramatically 
at decreasing distance. When the constraint acts on d cartesian coordinates, d — 1 
degrees of freedom are left and the Jabobian is r'^~^. One might be more interested 
then in the probability g{r) for finding the system in a volume element dV = r'^~^dr 
ad, distance r. Obviously P{r)dr = g{r)r'^~^dr. The corresponding free energy profile 
hence reads 

Ag{r) = {I - d)lnr + A{r) (38) 

This profile can be calculated analytically [35] and turns out to be constant if the 
potential energy landscape is flat. Otherwise it represents the energetics of a transition 
in a reasonable and comprehensive way. 
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4 Applications 

Although the focus of this review is on methodology, a brief glimpse on the very dif- 
ferent applications of the theory is to illustrate the scope and benefit of simulations 
that are fully evaluated with respect to pathways and free energy. 

A few examples of free energy calculations on small molecules were already given 
above for demonstrating the use of simple RCs. The pioneering papers did apply the 
method to an ion pair in solvent [32] and dihedral angles [20,35]. The photoinduced 
proton transfer in the Watson-Crick GC base pair was studied using the dynamic 
distance [8] in Car-Parinello simulations. The simulations revealed the sequence of 
elementary proton transfer steps, which is a typical result of application of RCs that 
comprise many particle coordinates like (8) and (10), and explained the spontaneous 
repair after irradiation by the shape of the free energy profile. Car-Parinello simula- 
tions with constraint are also the background of numerous other free energy calcula- 
tions of chemical reactions [7] . 

The discussion of PES and pathways has adumbrated the problems of applications 
of the full method to macromolecules. It is certainly prohibitive to solve the dock- 
ing problem with pathway search as discussed above, but undocking of ligands and 
dissociation of dimers of proteins are feasible. For instance, the dissociation of two 
superoxide dismutase molecules was studied in much detail and characterized by a 
free energy profile [33]. The dissociation of a phenyl molecule from a insulin complex 
was investigated in a similar way [38]. Most applications of constraints to activated 
processes, however, do without thermodynamic evaluation. They are restricted to the 
characterization of pathways with respect to details which can be verified experimen- 
tally, ranging from rather local [39] to large scale conformational transitions [4,5,40], 
folding/unfolding [31,41] and many other applications of the TMD coordinate (8). 

It has been known for a long time that numerical free energy calculations do not 
converge when performed with a slow-growth technique, but require application of a 
windows scheme [42] . Slow growth of the RC (7) is, however, suited to generate path- 
ways. However, the approximate free energy profile A(r(t)) calculated on the fly from 
the momentary constraint force can indicate a sequence of events like bond ruptures 
and is hence a useful tool to explore details of a pathway. In this regard, it is superior 
to an energy profile E(t) as free energy tends to be much less noisy. 

A window scheme means performing a constraint simulation at n subsequent val- 
ues of the RC (6) for getting reliable mean force values (37) from the constraint 
force, i.e. the Lagrangian parameter. The force tends to fluctuate with considerable 
amplitudes[33,38], which requires control of convergence. The free energy profile is 
obtained then by numerical integration. 

A very clear and detailed comparison of the constraint method with umbrella 
sampling was published by Trzesniak et al. [1] who applied a few variants of both 
methods to the association of two methane molecules in water with the same simula- 
tion period. Of course, results will in general depend on the problem and the choice 
of the RC. The constraint method needs computation of the Fixman determinant in 
cases that deviate from those discussed above in section 2. On the other hand, there 
is no need to find to optimum restraint potential. In summary, the authors arrive at 
the result that both methods yield comparable results, the constraint method being 
the best choice. 
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5 Summary 

Although successful in many applications, the concept of a reaction coordinate is still 
under debate for principle reasons as well as for numerical problems arising eventu- 
ally. In favor of the method, one finds the argument that fixing a single coordinate 
(the RC) during a simulation run at finite temperature leaves a system practically 
full freedom to adapt the constraint (or restraint) by relaxation and equilibration in 
all coordinates but one. The path is a multidimensional entity that allows flexibility 
despite numerous small barriers most of which are probably due to transient hydrogen 
bonds when proteins in aqueous solution are considered. Convergence of the mean 
force can take a long time and must be monitored carefully. 

Nevertheless, other equally broad pathways may exist which are not detected this 
way, but possibly by repeating computation with different starting conditions. This 
is not a shortcoming of the RC approach, but a difficulty inherent to large systems. 
It poses a practical problem that can only be tackled by increasing numerical efforts. 
There seems to be no way to solve the problem mathematically. A second problem 
is the definition of an appropriate RC even for a small system. The optimum RC 
would measure the distance traveled along the underlying MEP, otherwise most of 
the transition takes place in the hyperplane orthogonal to the progress in the RC. A 
reliable indication for this is a sudden jump in the constraint force which is eventually 
observed, for instance, at accompanying proton transfer [36] and can be avoided by 
swapping the RC. It would also possible to compare the traveled distance in config- 
uration space with the one in the RC in order to monitor this kind of behavior. 

Constraints have proven to be a useful tool for simulating activated processes, in 
particular for the final calculation of free energy. Therefore they have been applied to 
an impressive number of problems ranging from local rotations to protein dissocia- 
tion. For generating pathways, constraint methods are particularly suited when large 
molecules or complexes are studied, while for small systems also other methods are 
available. It is in principle possible to replace a constraint everywhere by a restraint 
potential, and there is a tendency to do so because simulation packages allow to add 
a potential, but do not provide a simple way to implement a constraint. Constraints 
possess the advantage that postprocessing is superfluous and performed best in a real- 
istic, stringent comparison. We have also shown that a metric-tensor correction is not 
needed in practice for many reaction coordinates. If necessary, it is easily calculated 
from a simple formula derived recently. 
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